SCRAMBLE’N’GAMBLE: a tool for fast and facile generation of random data for statistical evaluation of QSAR models

Authors

  • Piotr F J Lipiński
  • Przemysław Szurmak
Abstract

A common practice in modern QSAR modelling is to derive models by variable selection methods working on large descriptor pools. As pointed out previously, this practice carries an intrinsic risk of finding chance correlations. It is therefore desirable to run tests that show how models built on random data perform. In this contribution, we introduce SCRAMBLE'N'GAMBLE, a simple and freely available software tool that facilitates data preparation for y-randomization and pseudo-descriptor tests. Four close-to-real-world modelling situations are then analysed. The tests indicate how the quality of the obtained QSAR models compares with that of chance models derived from random data. Non-randomness is not the only requirement for a good QSAR model; however, it is good practice to consider it together with internal statistical parameters and possible physical interpretations of a model.
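
The two tests named in the abstract can be illustrated with a minimal sketch. It is not the SCRAMBLE'N'GAMBLE implementation, whose internals the abstract does not describe. It assumes a deliberately simple stand-in for variable selection (keep the k descriptors most correlated with the response, then fit ordinary least squares), and all function names and parameters (`select_and_fit`, `y_randomization`, `pseudo_descriptor_test`, `k`, `n_trials`) are hypothetical. The variable selection step is repeated on every randomized data set, so chance models get the same opportunity to find spurious correlations as the real model.

```python
import numpy as np


def select_and_fit(X, y, k=2):
    """Toy variable selection (an assumption, not the paper's method):
    keep the k descriptors most correlated with y, fit OLS, return R^2."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Absolute Pearson correlation of each descriptor column with y.
    corr = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    best = np.argsort(corr)[-k:]
    # Ordinary least squares with an intercept on the selected columns.
    A = np.column_stack([np.ones(len(y)), X[:, best]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / (yc @ yc)


def y_randomization(X, y, k=2, n_trials=100, seed=0):
    """Refit after permuting the response; return (real R^2, chance R^2 array)."""
    rng = np.random.default_rng(seed)
    real = select_and_fit(X, y, k)
    chance = np.array(
        [select_and_fit(X, rng.permutation(y), k) for _ in range(n_trials)]
    )
    return real, chance


def pseudo_descriptor_test(X, y, k=2, n_trials=100, seed=0):
    """Refit on random-number descriptor pools of the same shape as X."""
    rng = np.random.default_rng(seed)
    return np.array(
        [select_and_fit(rng.standard_normal(X.shape), y, k) for _ in range(n_trials)]
    )


if __name__ == "__main__":
    # Synthetic data, for illustration only: 60 compounds, 20 descriptors,
    # response driven by the first two descriptors plus small noise.
    rng = np.random.default_rng(42)
    X = rng.standard_normal((60, 20))
    y = 3.0 * X[:, 0] + 3.0 * X[:, 1] + 0.1 * rng.standard_normal(60)
    real, chance = y_randomization(X, y)
    pseudo = pseudo_descriptor_test(X, y)
    print("real R^2:", real)
    print("y-randomization, max chance R^2:", chance.max())
    print("pseudo-descriptors, max chance R^2:", pseudo.max())
```

A real model whose R² is not clearly separated from the chance distributions should be treated with suspicion, whatever its internal statistics look like.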

Similar resources

A comparative QSAR study of aryl-substituted isobenzofuran-1(3H)-ones inhibitors

A comparative workflow, including linear and non-linear QSAR models, was carried out to evaluate the predictive accuracy of the models and to predict the inhibition activity of a series of aryl-substituted isobenzofuran-1(3H)-ones. The data set, consisting of 34 compounds, was randomly divided into training and test sets. Molecular descriptors were selected using the genetic algorithm (GA) as a f...

A Novel QSAR Model for the Evaluation and Prediction of (E)-N’-Benzylideneisonicotinohydrazide Derivatives as the Potent Anti-mycobacterium Tuberculosis Antibodies Using Genetic Function Approach

A dataset of (E)-N’-benzylideneisonicotinohydrazide derivatives as potent anti-Mycobacterium tuberculosis agents has been investigated using Quantitative Structure-Activity Relationship (QSAR) techniques. Genetic Function Algorithm (GFA) and Multiple Linear Regression Analysis (MLRA) were used to select the descriptors and to generate the correlation QSAR models that relate the Mi...

QSAR Study of 17β-HSD3 Inhibitors by Genetic Algorithm-Support Vector Machine as a Target Receptor for the Treatment of Prostate Cancer

The 17β-HSD3 enzyme plays a key role in the treatment of prostate cancer, and small inhibitors can be used to target it efficiently. In the present study, the multiple linear regression (MLR) and support vector machine (SVM) methods were used to interpret the chemical structural functionality against the inhibition activity of some 17β-HSD3 inhibitors. Chemical structural information were described thro...

Performance evaluation of EPM and MPSIAC Models for determination of Erosion Status of Shahriari Watershed

Soil erosion is one of the most important environmental issues in developing countries, including Iran, where information about its amount and distribution is inaccurate. For this purpose, the accuracy and distribution of erosion classes obtained from the EPM and MPSIAC models, compared with BLM as ground truth values, were evaluated in the Shahriari watershed. First, the required data and informati...

Journal title:

Volume 71  Issue 

Pages  -

Publication date: 2017